Standard Memory Hierarchy Does Not Fit Simultaneous Multithreading

Author

  • Sébastien Hily
Abstract

Simultaneous multithreading (SMT) is a promising approach to maximizing performance by improving processor utilization. We investigate how the memory hierarchy behaves under SMT. First, we show that ignoring L2 cache contention leads to a strong overestimate of the achievable performance and may lead to incorrect conclusions. We then explore the impact of various memory hierarchy parameters. We show that the number of supported threads has to be chosen according to the cache size, that the L1 caches have to be associative, and that small blocks have to be used. As a result, the hardware constraints on memory hierarchy design should limit the interest of SMT to a few threads.
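
The interaction between thread count, cache size, and associativity described in the abstract can be illustrated with a toy model. The sketch below is not the simulator used in the paper and does not reproduce its results; the function name `miss_rate`, all parameter values, the round-robin interleaving, and the address layout (each thread's array placed exactly one cache size apart) are assumptions chosen only to make conflicts visible.

```python
from collections import OrderedDict


def miss_rate(num_threads, cache_bytes=1024, block_bytes=32, ways=1,
              footprint_bytes=512, passes=20):
    """Miss rate of a toy shared set-associative LRU cache.

    Each thread repeatedly walks its own private array; accesses from
    the threads are interleaved round-robin (an assumption, not a model
    of a real SMT fetch policy).
    """
    num_sets = cache_bytes // (block_bytes * ways)
    # One OrderedDict per set; insertion order tracks LRU order.
    sets = [OrderedDict() for _ in range(num_sets)]
    accesses = 0
    misses = 0
    for _ in range(passes):
        for offset in range(0, footprint_bytes, block_bytes):
            for tid in range(num_threads):
                # Hypothetical layout: arrays are spaced one cache size
                # apart, so equal offsets collide in the same set.
                addr = tid * cache_bytes + offset
                block = addr // block_bytes
                lru = sets[block % num_sets]
                accesses += 1
                if block in lru:
                    lru.move_to_end(block)       # mark most recently used
                else:
                    misses += 1
                    if len(lru) >= ways:
                        lru.popitem(last=False)  # evict least recently used
                    lru[block] = True
    return misses / accesses


if __name__ == "__main__":
    for threads, ways in [(1, 1), (2, 1), (2, 2), (4, 4)]:
        print(threads, "thread(s),", ways, "way(s):", miss_rate(threads, ways=ways))
```

In this contrived setup, two threads thrash a direct-mapped cache while a 2-way cache of the same size absorbs both working sets, and four threads overflow the cache regardless of associativity because their combined footprint exceeds its capacity. Real workloads and real address layouts will behave less cleanly.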

Similar resources

Memory Subsystem Design for Multithreaded Processors

Multithreading processors pose new challenges and new opportunities for cache/memory hierarchy design. Multithreading significantly alters the data reference stream seen by the memory subsystem. Multithreading also demands very different performance characteristics from the cache hierarchy than a typical (uniprocessor) CPU. This paper is specifically concerned with memory hierarchy design consi...

Memory Hierarchy Studies of Multimedia-enhanced Simultaneous Multithreaded Processors for MPEG-2 Video Decompression

This paper explores cache models for a simultaneous multithreaded processor with multimedia enhancements. We start with a wide-issue superscalar processor, enhance it by the simultaneous multithreading (SMT) technique, by multimedia units, and by an additional on-chip RAM storage. Our workload is a multithreaded MPEG-2 video decompression algorithm that extensively uses multimedia units. Variou...

Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor

Existing multiprocessor synchronization mechanisms are relatively heavyweight, due in part to the level of the memory hierarchy (typically main memory) at which threads must synchronize. Multithreaded processors, on the other hand, have the potential to significantly reduce synchronization cost, because threads share the processor simultaneously and can synchronize using processor-internal stat...

Increasing data reuse of sparse algebra codes on simultaneous multithreading architectures

In this paper the problem of the locality of sparse algebra codes on simultaneous multithreading architectures is studied. In this kind of architecture, many hardware structures are dynamically shared among the running threads. This puts a lot of stress on the memory hierarchy, and poor locality, both inter-thread and intra-thread, may become a major bottleneck in the performance of a code. T...

Efficient Sampling Startup for Uniprocessor and Simultaneous Multithreading Simulation

Modern architecture research relies heavily on detailed pipeline simulation. Simulating the full execution of an industry standard benchmark can take weeks to months. Statistical sampling and techniques like SimPoint that pick small sets of execution samples have been shown to provide accurate results while significantly reducing simulation time. The inefficiencies in sampling are (a) needing t...

Publication date: 1998